{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Least-squares"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In a least-squares, or linear regression, problem, we have measurements $A \\in \\mathcal{R}^{m \\times n}$ and $b \\in \\mathcal{R}^m$ and seek a vector $x \\in \\mathcal{R}^{n}$ such that $Ax$ is close to $b$. Closeness is defined as the sum of the squared differences:\n",
    "$$ \\sum_{i=1}^m (a_i^Tx - b_i)^2, $$\n",
    "also known as the $\\ell_2$-norm squared, $\\|Ax - b\\|_2^2$.\n",
    "\n",
    "For example, we might have a dataset of $m$ users, each represented by $n$ features. Each row $a_i^T$ of $A$ is the features for user $i$, while the corresponding entry $b_i$ of $b$ is the measurement we want to predict from $a_i^T$, such as ad spending. The prediction is given by $a_i^Tx$.\n",
    "\n",
    "We find the optimal $x$ by solving the optimization problem\n",
    "$$  \n",
    "    \\begin{array}{ll}\n",
    "    \\mbox{minimize}   & \\|Ax - b\\|_2^2.\n",
    "    \\end{array}\n",
    "$$\n",
    "Let $x^\\star$ denote the optimal $x$. The quantity $r = Ax^\\star - b$ is known as the residual. If $\\|r\\|_2 = 0$, we have a perfect fit."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Example"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the following code, we solve a least-squares problem with CVXPY."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\n",
      "The optimal value is 7.005909828287484\n",
      "The optimal x is\n",
      "[ 0.17492418 -0.38102551  0.34732251  0.0173098  -0.0845784  -0.08134019\n",
      "  0.293119    0.27019762  0.17493179 -0.23953449  0.64097935 -0.41633637\n",
      "  0.12799688  0.1063942  -0.32158411]\n",
      "The norm of the residual is  2.6468679280023557\n"
     ]
    }
   ],
   "source": [
    "# Import packages.\n",
    "import cvxpy as cp\n",
    "import numpy as np\n",
    "\n",
    "# Generate data.\n",
    "m = 20\n",
    "n = 15\n",
    "np.random.seed(1)\n",
    "A = np.random.randn(m, n)\n",
    "b = np.random.randn(m)\n",
    "\n",
    "# Define and solve the CVXPY problem.\n",
    "x = cp.Variable(n)\n",
    "cost = cp.sum_squares(A*x - b)\n",
    "prob = cp.Problem(cp.Minimize(cost))\n",
    "prob.solve()\n",
    "\n",
    "# Print result.\n",
    "print(\"\\nThe optimal value is\", prob.value)\n",
    "print(\"The optimal x is\")\n",
    "print(x.value)\n",
    "print(\"The norm of the residual is \", cp.norm(A*x - b, p=2).value)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
